Overview

Dataset Statistics

Number of Variables 4
Number of Rows 3321
Missing Cells 0
Missing Cells (%) 0.0%
Duplicate Rows 0
Duplicate Rows (%) 0.0%
Total Size in Memory 458.7 KB
Average Row Size in Memory 141.4 B
Variable Types
  • Numerical: 1
  • Categorical: 3

Dataset Insights

  • ID is uniformly distributed (Uniform)
  • Gene has a high cardinality: 264 distinct values (High Cardinality)
  • Variation has a high cardinality: 2996 distinct values (High Cardinality)
  • Class has constant length 1 (Constant Length)

Variables


ID

numerical

Approximate Distinct Count 3321
Approximate Unique (%) 100.0%
Missing 0
Missing (%) 0.0%
Infinite 0
Infinite (%) 0.0%
Memory Size 53136
Mean 1660
Minimum 0
Maximum 3320
Zeros 1
Zeros (%) 0.0%
Negatives 0
Negatives (%) 0.0%
  • ID is uniformly distributed

Quantile Statistics

Minimum 0
5-th Percentile 166
Q1 830
Median 1660
Q3 2490
95-th Percentile 3154
Maximum 3320
Range 3320
IQR 1660

Descriptive Statistics

Mean 1660
Standard Deviation 958.8344
Variance 919363.5
Sum 5.5129e+06
Skewness 0
Kurtosis -1.2
Coefficient of Variation 0.5776
  • ID is not normally distributed (p-value 0.0)
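
The ID figures above are what one would expect if ID is simply the row index 0 through 3320. The report does not say how ID was generated, so that is an assumption here. Under it, the following Python sketch recomputes several of the reported values using only the standard library.

```python
import statistics

# Assumption: ID is the row index 0..3320 (3321 rows, minimum 0, maximum 3320).
ids = list(range(3321))

mean = statistics.mean(ids)            # arithmetic mean
variance = statistics.variance(ids)    # sample variance (n - 1 denominator)
std = statistics.stdev(ids)            # sample standard deviation
cv = std / mean                        # coefficient of variation

# method="inclusive" interpolates linearly between the observed min and max.
q1, median, q3 = statistics.quantiles(ids, n=4, method="inclusive")
iqr = q3 - q1

print(mean, variance, round(std, 4), round(cv, 4))
print(q1, median, q3, iqr)
```

A perfectly even spread of integers also explains the zero skewness and the kurtosis near -1.2, which is the excess kurtosis of a uniform distribution.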

Gene

categorical

Approximate Distinct Count 264
Approximate Unique (%) 7.9%
Missing 0
Missing (%) 0.0%
Memory Size 230859
  • The most frequent value (BRCA1) occurs over 1.62 times as often as the second most frequent value (TP53)
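
A minimal sketch of how a cardinality figure and a top-value ratio like the one above can be derived from value counts. The `genes` list is a made-up toy sample, not the profiled column, and `top_ratio` is a hypothetical helper name.

```python
from collections import Counter

def top_ratio(values):
    """Return (most frequent value, runner-up, ratio of their counts)."""
    (first, n1), (second, n2) = Counter(values).most_common(2)
    return first, second, n1 / n2

# Toy sample for illustration only; not the actual Gene column.
genes = ["BRCA1"] * 5 + ["TP53"] * 3 + ["CBL"] * 2

first, second, ratio = top_ratio(genes)
distinct = len(set(genes))  # cardinality: number of distinct values

print(first, second, round(ratio, 2), distinct)
```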

Length

Mean 4.5149
Standard Deviation 0.9486
Median 4
Minimum 2
Maximum 8

Sample

1st row FAM58A
2nd row CBL
3rd row CBL
4th row CBL
5th row CBL

Letter

Count 12336
Lowercase Letter 0
Space Separator 0
Uppercase Letter 12336
Dash Punctuation 9
Decimal Number 2649
  • The most frequent value (BRCA1) occurs over 1.62 times as often as the second most frequent value (TP53)

Variation

categorical

Approximate Distinct Count 2996
Approximate Unique (%) 90.2%
Missing 0
Missing (%) 0.0%
Memory Size 238756

Length

Mean 6.8928
Standard Deviation 4.4771
Median 5
Minimum 3
Maximum 55

Sample

1st row Truncating Mutatio...
2nd row W802*
3rd row Q249E
4th row N454D
5th row L399V

Letter

Count 12827
Lowercase Letter 5176
Space Separator 297
Uppercase Letter 7651
Dash Punctuation 151
Decimal Number 9394
  • Variation contains many words: 3010 words
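
The rows in the Letter blocks read like per-character tallies by Unicode general category. As a sketch under that assumption (the report does not document its method), the counts can be reproduced with Python's `unicodedata` module. The input here is only the four untruncated sample values shown above, not the full column.

```python
import unicodedata
from collections import Counter

# Sample values shown in the report (the truncated first row is omitted).
values = ["W802*", "Q249E", "N454D", "L399V"]

# Map each character to its two-letter Unicode general category, e.g.
# "Lu" (uppercase letter), "Ll" (lowercase letter), "Nd" (decimal number),
# "Zs" (space separator), "Pd" (dash punctuation).
categories = Counter(unicodedata.category(ch) for v in values for ch in v)

letters = categories["Lu"] + categories["Ll"]  # the "Count" row: all letters
lengths = [len(v) for v in values]             # input to the Length block

print(dict(categories), letters, lengths)
```

Characters such as `*` fall into categories the report does not list, which would explain why the listed rows need not add up to the total number of characters in the column.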

Class

categorical

Approximate Distinct Count 9
Approximate Unique (%) 0.3%
Missing 0
Missing (%) 0.0%
Memory Size 219186

Length

Mean 1
Standard Deviation 0
Median 1
Minimum 1
Maximum 1

Sample

1st row 1
2nd row 2
3rd row 2
4th row 3
5th row 4

Letter

Count 0
Lowercase Letter 0
Space Separator 0
Uppercase Letter 0
Dash Punctuation 0
Decimal Number 3321
  • Class has words of constant length

Interactions

Correlations

Missing Values